On Creating Reference Templates for Speaker Independent Recognition of Isolated Words

Author

  • LAWRENCE R. RABINER
Abstract

The three aspects of a statistical approach to a pattern recognition problem are the selection of features, choice of a measure of similarity, and a method for creating the reference templates (patterns) used in the statistical tests. This paper discusses a philosophy for creating reference templates for a speaker independent, isolated word recognition system. Although there remain many unanswered questions both about how to select appropriate features for recognition, and how to measure similarity between sets of features, such issues are not discussed here. Instead we concentrate on methods for creating the reference templates. In particular, a method of combining word patterns from a number of speakers is proposed in which a clustering type of analysis is used to determine which patterns are merged to create a word template. The creation of multiple templates, based on this method, is discussed and is shown to be of substantial value for as few as eight speakers in the training set. To test the ideas proposed here, a 54 word vocabulary word recognition system was implemented. All input words were recorded off a standard telephone line. The features used were the LPC coefficients of an 8-pole analysis, and the simple Itakura distance measure was used to measure similarity between patterns. With word templates obtained as described above, recognition accuracies of 85 percent were obtained in a forced choice recognition test on the 54 word vocabulary using eight new speakers. The correct word was within the top five choices 98 percent of the time. Using a strategy in which all the training words were used to create the templates, the recognition accuracy fell to 77 percent, and the correct word was within the top five choices only 89 percent of the time.
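
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea: group one word's training patterns into clusters, average each cluster into a template, and recognize a new pattern by its nearest template. Everything named here is an assumption made for illustration: the greedy threshold rule, the helper names (`cluster_patterns`, `centroid`, `make_templates`, `recognize`), the use of Euclidean distance as a stand-in for the Itakura distance, and the treatment of each pattern as a fixed-length vector (that is, already time-aligned).

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def centroid(members):
    """Average a list of equal-length feature vectors component-wise."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


def cluster_patterns(patterns, threshold):
    """Hypothetical greedy clustering (not the paper's procedure).

    Each pattern joins the first cluster whose current centroid lies
    within `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of patterns
    for pattern in patterns:
        for members in clusters:
            if dist(pattern, centroid(members)) <= threshold:
                members.append(pattern)
                break
        else:
            clusters.append([pattern])
    return clusters


def make_templates(patterns, threshold):
    """Merge each cluster into one template, giving multiple templates per word."""
    return [centroid(c) for c in cluster_patterns(patterns, threshold)]


def recognize(test_pattern, templates_by_word):
    """Rank vocabulary words by the distance to their closest template."""
    scores = {
        word: min(dist(test_pattern, t) for t in templates)
        for word, templates in templates_by_word.items()
    }
    return sorted(scores, key=scores.get)
```

With this layout, a forced-choice decision corresponds to taking the first entry of the list returned by `recognize`, and a "top five" score corresponds to checking whether the correct word appears among the first five entries. Averaging every training pattern of a word into a single template is the special case of a threshold large enough that only one cluster forms.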


Similar resources

Creating speaker-specific phonetic templates with a speaker-independent phonetic recognizer: implications for voice dialing

We present a new approach to speaker dependent template generation which uses dramatically less storage to represent a speaker's words, with minimal degradation in recognition accuracy. In this approach, the symbolic string produced by a speaker-independent phonetic recognizer is used to represent utterances. We investigate effective procedures for template generation, and compare the results of...


Development of Isolated Word Speech Recognition System

The isolated word speech recognition system based on dynamic time warping (DTW) has been developed. Speaker adaptation is performed using speaker recognition techniques. Vector quantization is used to create reference templates for speaker recognition. Linear predictive coding (LPC) parameters are used as features for recognition. Performance is evaluated using 12 words of Lithuanian language p...


Speaker-independent word recognition by less cost and stochastic dynamic time warping method

In this paper, we describe some considerations on a speaker-independent word recognition method on a large vocabulary size by the concatenation of syllable templates and a stochastic dynamic time warping method, where syllable templates are taken from spoken words. We got the reference patterns from 216 words uttered by 30 male speakers and recognized the other 200 words uttered by the other 10...




Publication date: 2002